Is intelligibility still the main problem? A review of perceptual quality dimensions of synthetic speech
Abstract
In this paper, we present a comparative overview of 9 studies on perceptual quality dimensions of synthetic speech. Different subjective assessment techniques were used to evaluate the text-to-speech (TTS) stimuli in these tests: in a semantic differential, the test participants rate every stimulus on a given set of rating scales, while in a paired comparison test, the subjects rate the similarity of pairs of stimuli. Perceptual quality dimensions can be derived from the results of both test methods, either by performing a factor analysis or via multidimensional scaling. We show that even though the 9 tests differ in synthesizer type, stimulus duration, language, and quality assessment method, the resulting perceptual quality dimensions can be linked to 5 universal quality dimensions of synthetic speech: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) disturbances, and (v) calmness.
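As a rough illustration of how multidimensional scaling can turn paired-comparison dissimilarities into a low-dimensional perceptual space, the sketch below implements classical (Torgerson) MDS with NumPy. The abstract does not say which MDS variant the reviewed studies used, so this choice is an assumption. The function name `classical_mds` and the four "stimuli" are hypothetical, and the dissimilarities are computed from made-up 2-D positions rather than taken from any listening test.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims=2):
    """Embed n items in n_dims dimensions from an n x n dissimilarity matrix.

    Classical (Torgerson) MDS: double-center the squared dissimilarities
    and keep the eigenvectors with the largest eigenvalues.
    """
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Centering matrix J = I - (1/n) * 1 1^T
    j = np.eye(n) - np.ones((n, n)) / n
    # Gram matrix of the centered configuration
    b = -0.5 * j @ (d ** 2) @ j
    eigvals, eigvecs = np.linalg.eigh(b)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by rounding
    vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(vals)


# Hypothetical example: four stimuli whose "true" perceptual positions are
# invented here; real input would be averaged dissimilarity ratings.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(d, n_dims=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.allclose(recovered, d))
```

The recovered coordinates match the original configuration only up to rotation and reflection, which is why interpreting the axes of such a space as perceptual dimensions requires additional evidence such as attribute ratings.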
Similar resources
Speech difficulties in Joubert syndrome
Introduction: "Joubert syndrome" was first introduced in 1969. This syndrome is a rare genetic disease with an autosomal dominant pattern. Hypotonia, ataxia, and motor delay are known clinical manifestations of the disease. In the few reports of this syndrome, mostly functional and structural components studied and radiographic images such as speech and language developmental delay symptoms has been l...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
Assessing the Intelligibility and Quality of HMM-based Speech Synthesis with a Variable Degree of Articulation
This paper focuses on the assessment of both the intelligibility and the quality of speech when using a variable degree of articulation (hypo/hyperarticulation) in the framework of HMM-based speech synthesis. Intelligibility is evaluated when the synthesizer is working in adverse conditions. The adaptation of a neutral speech synthesizer to generate hypo and hyperarticulated speech is first per...
Effects of intelligibility on working memory demand for speech perception.
Understanding low-intelligibility speech is effortful. In three experiments, we examined the effects of intelligibility on working memory (WM) demands imposed by perception of synthetic speech. In all three experiments, a primary speeded word recognition task was paired with a secondary WM-load task designed to vary the availability of WM capacity during speech perception. Speech intelligibilit...
Diagnostic Evaluation of Synthetic Speech Using Speech Recognition
The paper presents experiments on the use of automatic speech recognition for diagnostic evaluation of synthetic speech. Our previous work on the topic showed a strong correlation between the subjective and objective evaluation (ITU-T Rec. P.862 PESQ) of the quality of synthetic speech. The main drawback of the approach was the need for original human (reference) recordings in one to one mappin...